Restructuring Gaussian mixture density functions in speaker-independent acoustic models
Abstract
In continuous speech recognition based on hidden Markov models (HMMs), word N-grams, and time-synchronous beam search, a local modeling mismatch in the HMM often degrades recognition performance. To cope with this problem, this paper proposes a method of restructuring the Gaussian mixture pdfs in a pre-trained speaker-independent HMM based on speech data. In this method, mixture components are copied and shared among multiple mixture pdfs, taking the tendency of local errors into account. This tendency is obtained by comparing the pre-trained HMM with the speech data that was used in the pre-training. Experimental results show that the proposed method can effectively correct local modeling mismatches and improve recognition performance.
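The idea of copying a mixture component into another mixture pdf and sharing it can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how components are selected or how weights are assigned, so the component pool, the function names (`log_gauss_diag`, `mixture_log_pdf`, `share_component`), the diagonal covariances, and the fixed weight of 0.2 are all assumptions made for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def mixture_log_pdf(x, mixture, pool):
    """Log density of a mixture given as a list of (component_id, weight).

    Components live in a shared pool, so several mixtures can reference
    the same Gaussian without duplicating its parameters.
    """
    terms = [math.log(w) + log_gauss_diag(x, *pool[cid]) for cid, w in mixture]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def share_component(mixture, cid, new_weight):
    """Return a new mixture that also references pool component `cid`.

    Hypothetical weighting rule: existing weights are scaled by
    (1 - new_weight) so that all weights still sum to 1.
    """
    if not 0.0 < new_weight < 1.0:
        raise ValueError("new_weight must lie strictly between 0 and 1")
    if any(c == cid for c, _ in mixture):
        raise ValueError("component is already part of this mixture")
    scaled = [(c, w * (1.0 - new_weight)) for c, w in mixture]
    return scaled + [(cid, new_weight)]

# Shared pool: component id -> (mean, variance), one-dimensional for brevity.
pool = {0: ([0.0], [1.0]), 1: ([4.0], [1.0])}
state_a = [(0, 1.0)]  # mixture pdf of one HMM state
state_b = [(1, 1.0)]  # mixture pdf of another HMM state

# Suppose frames near x = 4 are poorly modeled by state_a (a local mismatch).
# Let state_a share component 1, which originally belonged only to state_b.
state_a_restructured = share_component(state_a, 1, 0.2)

x = [4.0]
before = mixture_log_pdf(x, state_a, pool)
after = mixture_log_pdf(x, state_a_restructured, pool)
```

In this toy case `after` is larger than `before`, because the shared component covers the region that `state_a` modeled poorly. Keeping the components in a pool and referencing them by id means the shared Gaussian is stored once, which is one plausible way to realize "shared among multiple mixture pdfs".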
Similar resources
Principal mixture speaker adaptation for improved continuous speech recognition
Nowadays, almost all speaker-independent (SI) speech recognition systems use CDHMMs with multivariate Gaussian mixtures as the observation density to cover speaker variability. It has been shown that, given sufficient training data, the more mixtures are used in the HMM observation density, the better the system performs. However, acoustic HMM with more Gaussian densities is more complex and slows ...
Speaker independent acoustic modeling using speaker normalization
This paper proposes a novel speaker-independent (SI) modeling method for spontaneous speech data from multiple speakers. The SI acoustic model parameters are estimated by separate training for inter-speaker variability and for intra-speaker phonetically related variation in order to obtain a more accurate acoustic model. The linear transformation technique is used for the speaker normalization to ext...
Phonetic, idiolectal and acoustic speaker recognition
This paper describes a text-independent speaker recognition system that achieves an equal error rate of less than 1% by combining phonetic, idiolect, and acoustic features. The phonetic system is a novel language-independent speaker-recognition system based on differences among speakers in dynamic realization of phonetic features (i.e., pronunciation), rather than spectral differences in voice q...
Speaker adaptation using tree structured shared-state HMMs
This paper proposes a novel speaker adaptation method that flexibly controls state-sharing of HMMs according to the amount of adaptation data. In our scheme, acoustic modeling is combined with adaptation to efficiently utilize the acoustic models' sharing characteristics for adaptation. The shared-state set of HMMs is determined by using tree-structured shared-state HMMs created from the history rec...
Text Independent Speaker Identification Using Automatic Acoustic Segmentation
This paper describes an acoustic class dependent technique for text independent speaker identification on very short utterances. The technique is based on maximum likelihood estimation of a Gaussian mixture model representation of speaker identity. Gaussian mixtures are noted for their robustness as a parametric model and their ability to form smooth estimates of rather arbitrary underlying den...